An Ising-like model for protein mechanical unfolding 
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The mechanical unfolding of proteins is investigated by extending the Wako-Saito-Munoz-Eaton model, a 
simplified protein model with binary degrees of freedom, which has proved successful in describing the kinetics 
of protein folding. Such a model is generalized by including the effect of an external force, and its thermody- 
namics turns out to be exactly solvable. We consider two molecules, the 27th immunoglobulin domain of titin 
and protein PIN1. In the case of titin we determine equilibrium force-extension curves and study nonequilib- 
rium phenomena in the frameworks of dynamic loading and force clamp protocols, verifying theoretical laws 
and finding the position of the kinetic barrier which hinders the unfolding of the molecule. The PIN1 molecule 
is used to check the possibility of computing the free energy landscape as a function of the molecule length by 
means of an extended form of the Jarzynski equality. 



PACS numbers: 87.15.Aa, 87.15.He, 87.15.-v 
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Manipulation experiments on single biomolecules have 
greatly increased our knowledge of the structural properties 
of such molecules. In a typical experiment a controlled force 
is applied to one of the free ends of the molecule, and the in- 
duced elongation is measured. Such experimental techniques 
have been used to probe the structure of proteins HI 0] and 
nucleic acids JUHl. According to the common interpretation 
the unfolding of a molecule being pulled from one of its ends 
is hindered by kinetic barriers associated with the strongest 
linkages which serve to stabilize the molecular structure. The 
breaking of a molecular bond can thus be viewed as the over- 
coming of a kinetic barrier. It has been argued |5] that the 
study of the kinetics of bond breaking under different loading 
rates can provide much information about the internal struc- 
ture of molecules, and in particular allows one to measure the 
strength of the molecular bonds, and to associate to them a 
position along the molecular structure. In the case of sim- 
ple molecules, such as RNA hairpins, it is easy to obtain in- 
formation on the molecular structure by pulling experiments 
J3I]. However, when one deals with large molecular struc- 
tures, such as multi-domain proteins, the inference of struc- 
tural characteristics from the unfolding kinetics can prove a 
difficult task. Therefore, the study of simple models for the 
unfolding of proteins, whose microscopic native structures are 
known a priori, is highly desirable: investigating the kinetics 
of such models can shed new light on the relations between the 
experimentally observed unfolding features and the molecular 
structures. 

The experiments discussed above are usually performed in 
non-equilibrium conditions: because of technical limitations 
the pulling process is faster than the typical molecular relax- 
ation time. The problem of irreversibility of unfolding pro- 
cesses can be avoided by using the remarkable equality intro- 
duced by Jarzynski which allows one to measure the free 
energy difference between the folded and the unfolded state of 
a biomolecule |7J]. By using an extended form of the Jarzyn- 
ski equality (JE) the free energy landscape of simple models 
of biopolymers has been probed as a function of the molecu- 
lar elongation . Although this approach appears very 



promising, it still has to be tested on systems of increasing 
complexity. 

Here we approach the mechanical unfolding problem by 
means of a suitable generalization of the Wako-Saito-Munoz- 
Eaton (WSME) protein folding model This is a 
simplified statistical mechanical model where a binary vari- 
able is associated to each peptide bond. The equilibrium ther- 
modynamics has been solved exactly ifTil [l5ll and the model 
has been quite successful in describing the kinetics of protein 
folding §16, 17, Til [lpl and has also found applications in 
different fields (see [20] and references therein). 

In the present paper we first extend the WSME model by 
considering the effect of an external force, and show that the 
equilibrium properties of this new model can be computed 
exactly, similarly to the original WSME model. In order to 
mimic the mechanical unfolding of proteins, we use computer 
simulations and study the unfolding kinetics of our model, 
both in the cases of constant force and dynamic loading. Fi- 
nally we probe the free energy landscape of a model protein, 
exploiting the JE. 

The WSME model describes a protein of N + I aminoacids 
as a chain of N peptide bonds (connecting consecutive 
aminoacids) that can live in two states (native and unfolded) 
and can interact only if they are in contact in the native struc- 
ture and if all bonds in the chain between them are native. 
To each bond is associated a binary variable ra^, with values 
0, 1 for unfolded and native state respectively. The effective 
Hamiltonian of the model reads 

N-l N j N 

1=1 j=i+\ k=i i=l 

where kg is the Boltzmann constant, and T is the absolute 
temperature. The first term assigns an energy ey < to 
the contact (defined as in [1411 ) between bonds i and j 
if this takes place in the native structure (Ay = 1 in this 
case and Ay = otherwise). The second term represents 
the entropic cost g, > of ordering bond i. In Ref. 111411 
it is shown how to compute exactly the partition function 
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Z = Yi{m k \ ex Pl~f^o(i m k})] an d the corresponding thermal 
averages. Here and in the following, the quantity /3 indicates 
the inverse temperature /3 = l/iksT). 

In order to couple the protein to an external force we as- 
sume that a configuration {m^} of the model defines a sequence 
of native stretches, separated by unfolded peptide bonds (see 
inset of Fig.Q]). Peptide bonds i and j delimit a native stretch 
if and only if to, = mj — and m k = 1 for i < k < j. A 
native stretch is regarded as a rigid portion of the molecule, 
even under application of the external force, with an end-to- 
end length Uj. If j = i + 1 the stretch reduces to a single 
aminoacid, which is also regarded as a rigid structure, with 
length Z, The values of the parameters /y are taken from 
the native structure of the protein 112 ill . Boundary conditions 
are introduced through the dummy bonds toq = m^ + \ = 0. 
Given the direction of the external force, we assume that a na- 
tive stretch, or a single aminoacid delimited by two successive 
unfolded bonds, can only take two orientations, parallel or an- 
tiparallel to the force, so that they contribute ±/y to the length 
of the molecule. Therefore, given a configuration {m k } of the 
peptide bonds, we introduce a variable cry for each rigid por- 
tion of the molecule, either a native stretch or an aminoacid 
delimited by two successive unfolded bonds, taking values +1 
(rigid portion parallel to the force) or -1 (antiparallel). Thus 
the end-to-end length of the molecule, in the force direction, 
reads 



L({m k ], {o-ij}) 
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Let us define the Hamiltonian <H as the sum of the interac- 
tion energy term, contained in the effective Hamiltonian fio 
(UJ, and of the term -fL, which takes into account the effect 
of the external force t H({m k ), {cry},/) = £i<j eyAy ]~[ J k=i m ~ 
fL({m k }, {(Tij)). Note that the definition of molecular length 
(|2j is such that the set of variables {cry} is dynamically de- 
fined by the bond configuration {m k }\ for each configura- 
tion {ntk}, we consider only those variables cry such that 
(1 - to,)(1 - to 7 )to, + ito, + 2 . . . to j- 1 = 1. One can easily sum 
over the variables {cry}: £ {try) exp[-f¥H({m k } , {cry},/)] = 
exp[- J 0'7Y e ff({TOi} ,/)], where the new effective Hamiltonian 
•74ff is given by 



m k 



i=l j=i+l k=i 

-k B T In [2 cosh (pfhj)] (1 - to,)(1 - mj) ]~[ m k ,(3) 



k=M 



and, as a function of {m k }, has the same structure as the 
WSME model (see Eq. (fl3). Therefore its equilibrium ther- 
modynamics is exactly solvable, as discussed in Ref. Ill 411 . In 
the case / = 0, "Hes reduces to Eq. (UJ (up to an additive con- 
stant), with qi = In 2. Thus, the effective entropic terms can 
be viewed as resulting from microscopic orientational degrees 
of freedom of the native stretches. 



In the following, the quantity e will indicate the system en- 
ergy scale. This quantity, together with the interaction ener- 
gies ey, will be chosen as in |l3, 14, 18], see also |21[. We 
also introduce the reduced temperature in terms of the energy 
scale: T = k^T je . The time scale will be indicated by frj. 

Following Ref. [14], for any choice of the model param- 
eters, one can calculate the equilibrium value of the order 
parameter to = l/N Y*k( m k) and of the molecule length, 
as defined by Eq. (f2j), for varying force and temperature. 
We first consider the titin immunoglobulin domain 127 (89 
aminoacids, pdb code 1TIT), which has been widely studied 
both experimentally [1] and theoretically 12211 . The energy 
scale for such a molecule is taken to be e/fcg = 43.1 K, while 
the melting temperature is T m =* 8.03 12 ill . In figureQ] the root 
mean square length of the 1TIT molecule is plotted as a func- 
tion of the external force / for different temperatures. The 
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FIG. 1: Root mean square length of the 1TIT molecule as a function 
of the external force /, for different temperatures (L is defined by 
Eq. 12J). Inset: Cartoon of the model protein under loading. 

plateau appearing in Fig. [T| in the low temperature regime, 
corresponds to the overall alignment of the molecule in the na- 
tive configuration (to 1) to the force direction. Having intro- 
duced the molecular length we have built a framework within 
which the mechanically induced protein unfolding can be sim- 
ulated by applying an external force. In the following, two 
manipulation schemes will be considered: in the first one the 
molecule is manipulated in a "force-clamp", where a sudden 
force is applied to one of the molecule's free ends, in the sec- 
ond one a time-varying force is applied, so that the load on the 
molecule increases gradually. In order to study the molecu- 
lar unfolding, we run Monte Carlo simulations with Metropo- 
lis kinetics using the Hamiltonian , H({m k } , {cry}, /(f)). In the 
following the time scale fo will correspond to a single Monte 
Carlo step. In the force clamp manipulation experiments, the 
molecule unfolds after a given time r u which fluctuates be- 
tween one realization of the unfolding process and the other, 
due to the stochasticity of the unfolding process. Unfolding 
can be viewed as an activated process [23], whose kinetics 
is dominated by a characteristic energy barrier AE U placed at 
the value x u of the reaction coordinate: therefore, it is usu- 
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ally assumed that the average unfolding time (t u ) follows an 
Arrhenius law (t u ) = a»T exp {/3(AE U - fx u )], where ljq is a 
characteristic attempt rate depending on the microscopic fea- 
tures of the system. 

We simulate force clamp manipulations of the 1TIT 
molecule as follows. Starting from thermal equilibrium with 
/ = 0, at t = we apply a non-zero force / and thus mea- 
sure the unfolding time t u as the first passage time of the 
molecule length across the threshold value L u , defined as half 
the molecule maximal length, L u = 140 A, see figure Q] In 
figure [2]the unfolding time r u of the 1TIT molecule, averaged 
over 1000 independent unfolding trajectories, is plotted as a 
function of the applied force /, for three values of the temper- 
ature T = 4, 6, 8. From a fit of the data shown in Fig. d2J to the 
Arrhenius law, we find that the unfolding length is x u 3 A, 
and is independent of the temperature, as expected (see cap- 
tion of fig. (fJJ for the exact values and uncertainties). Note 
that this value is in good agreement with the experimentally 
measured value of the titin unfolding length x u = 2.5 A, found 
inRef. fl. 
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FIG. 2: Average unfolding time of the 1TIT molecule as a function of 
the force for three values of the temperature. The lines are linear fits 
of the data to the Arrhenius law discussed in the text. From such fits 
we find the following values of the unfolding length (in A): T = 4, 
x u = 3.4 + 0.1; T = 6, x a = 3.2 + 0.1; T = 8, x a = 3.0 + 0.1. 

We now consider the case where a time-dependent force 
is applied to our model molecule and the unfolding time is 
sampled over 1000 independent trajectories. Here the force 
increases linearly with time, with a rate r, and thus the rupture 
force /* is given by /* = rr u , where unfolding time is defined 
as in the case of the force clamp. In refs. [23] it has been 
argued that, if the energy barrier AE U is large (compared to the 
thermal energy kgT) and rebinding is negligible, the typical 
unbinding force of a single molecular bond under dynamic 
loading is given by 

/* = k B T/x u ln\J3rx u T ] (4) 

where to is the characteristic unfolding time at zero force, 
to = io~Q l exp \J3AE,,]. In Fig. [3] the breaking force /* is 
plotted as a function of the pulling velocity, for the 1TIT 



molecule, for three values of the temperature. The value of 
the unfolding length obtained by fitting the data to Eq. (O 
is x u =i 3 A, and is independent of the temperature, as ex- 
pected (see caption of fig.f3]for the exact values). Note that 
this value agrees with that found with the force clamp ma- 
nipulation, and with the experimental value x u — 2.5 A found 
in Ref. On the other hand, in recent works lf24ll . it has 
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FIG. 3: Plot of the breaking force /* of the 1TIT molecule as a func- 
tion of the pulling velocity r, for the three values of temperature here 
considered. The lines are fits to the data in the linear regime defined 
by Eq. l[4j. From such fits we obtain x u = 3.1 ± 0.1 A for T = 4, 
x a = 3.0 ± 0.1 A for T = 6, and x u = 3.1 ± 0.2 A for T = 8. Inset: the 
lines are fits of the data to the equation defining /*, see text. 

been argued that the rupture force /* has a more complex 
expression/* = AE u /(vx u ) {1 - [-IcbT /AE U In (J3rx u Toe~ y )] v }, 
where the exponent v depends on the microscopic details of 
the energy landscape, and y is the Euler-Mascheroni constant 
y - 0.577. This equation reduces to Eq. (0| in the case v = 1 
or in the limit AE„ — > oo [24]. In the inset of Fig. [3] we plot 
the fits of the rupture force data to the equation defining /*. 
Although the agreement of the data with the equation appears 
to be rather good, the statistical errors of the fit parameters 
are quite large, since /* depends nonlinearly on the set of the 
unknown parameters. This issue will be addressed in a forth- 
coming paper. 

We now evaluate the effective free energy landscape as a 
function of the molecular length L of the extended model here 
discussed. Formally, the free energy function F(L) is defined 
by F(L) = -ksT In Z(L), where Z(L) is given by the sum of 
the Boltzmann weight exp[-/?H({m J t} , {cry},/ = 0)], over all 
those configurations {m^}, {op}, whose length L({mic}, {cr !; }) 
(Eq. (f2]l) equals the given value L. It can be shown, that 
the partition function Z(L) is related to the work done on the 
molecule during the manipulation via the extended JE [8] 

Z(L) = (8(L - L(x,)) exp (-/3W,)) exp(-J3f(t)L) (5) 

where x, is the system microscopic configuration at time t, 
L(x) is the macroscopic length corresponding to microscopic 
configuration x, W, is the thermodynamic work done on the 
system by the external potential, up to the time f, defined by 
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FIG. 4: Reconstructed free energy landscape F of the PIN1 molecule 
(lines), as_a function of L, for different pulling rates r (in pN//o units) 
and with T = 6. Circles: expected value of F(L) as obtained by direct 
evaluation of the free energy function. Inset: Plot of F(L) - fL, for 
/ = 2e/A. The new minimum at L =s 120 A corresponds to the 
length of the molecule in the large force regime (data not shown). 
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W, = J Q dt'&Hldt ', and the average is over all the trajectories 
of fixed duration t. In order to recover the partition function 
Z(L) from eq. © we use the procedure introduced and dis- 
cussed in ref. Q8|] . The investigation of the free energy land- 
scape of the 1TIT molecule, by using Eq. ||5}, turned out to 
be a very difficult task. Indeed, the typical value of the work 
associated to the unfolding of this molecule is of the order of 
some hundreds of k B T, for the value of e here used. Since one 
has to evaluate exp(-y8W,), in order to exploit Eq. (|5), the JE 
cannot give a reliable estimate of the energy landscape of the 
1TIT molecule. For a discussion of the range of applicability 
of the JE to microscopic systems see, e.g., 112511 . Therefore we 
consider the PIN1, a smaller protein whose folding character- 
istics have already been studied with the WSME model [18] 
(pdb code 1I6C, 39 aminoacids, e/k B = 44 K Its free 

energy landscape as a function of the molecule elongation L, 
as given by Eq. (0, is plotted in Fig.|4]for different velocities 
of the pulling protocol. It can be seen that the curves F(L) col- 
lapse onto the same curve, as the pulling velocity is decreased. 
This is a clear signature that the energy landscape is correctly 
reconstructed, and its best estimate is the collapse curve, as 
discussed in Ref. ifioll . On the other hand, the model intro- 
duced here is simple enough to allow the exact computation 
of the function Z(L), and hence we can obtain the exact value 
of the function F (L): the agreement with the landscape eval- 
uated by the pulling manipulations is found to be very good, 
see Fig. [4] 

In conclusion, we have introduced and studied a model of 
proteins under external loading. The unfolding length of the 
titin model is found to be in good agreement with the ex- 
perimental one. We believe that this result represents a re- 
markable validation of the model that we have introduced: it 
suggests that our model, although minimal, captures the ba- 
sic mechanisms underlying the unfolding process of proteins. 



For a small protein, the extended form of the Jarzynski equal- 
ity gives an estimate of the free energy landscape which is in 
good agreement with the expected one. We believe that our 
model can be successfully used to study the interplay between 
the protein structures and the kinetics of unfolding and refold- 
ing under external loading. As an example, by using computer 
simulations and applying an external force, one can easily de- 
termine which are the main contacts stabilizing the molecular 
structures. Furthermore, our model is also suitable to study 
the thermal unfolding of the proteins, in order to compare the 
thermal and the mechanical unfolding paths. 
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Appendix to "An Ising-like model for 
protein mechanical unfolding" 

In this appendix we briefly review the definition of the pa- 
rameters e, ey, Ajj in the WSME model, and discuss the def- 
inition of the new parameters Zy which have been introduced 
in the main text. 

The parameters Ay and ey, appearing in eqs. (l)-(3) of the 
main text are chosen following Ref. [ 1 ], starting from the pro- 
tein native structure, as given in the Protein Data Bank (pdb 
in the following, http://www.pdb.org/). An atomic contact is 
present (Ay = 1) if, in the native state of the protein, at least 
two atoms from residues i and j + 1 (with j + 1 > i + 2) are 
closer than 4 A. In this case ey is taken to be equal to ke, 
where k is an integer such that 5(k - 1) < n at < 5k, and n at 
is the number of atomic contacts. As an example, in table [I] 
the values of the contact parameter k, defined as k = ey/e are 
listed for the 1TIT molecule. 

The lengths Zy, in a generic N + I aminoacid protein, are 
defined as follows. Let us represent the aminoacid ; with its 
Ni - C a j - Ci sequence. Taking the native state as the refer- 
ence configuration, Zy is chosen as the distance between the 
midpoint of the C, and Nt+i atoms and the midpoint of the C; 
and Nj+i atoms, see figure [5] If j = i + 1, /, ,+, corresponds 
to the length of a single aminoacid. If i = the first point 
is substituted with the position of the N\ atom, while when 
j = N + 1 we take the position of the Cn+i atom. 

In order to fix the energy scale e, we define the dimension- 
less unfolding temperature of our model T m , at zero force, 
as that temperature where the molecule order parameter m — 
1/NYik ( m k) is equal to 1/2. For a given molecule, using the 
experimental value of the melting temperature T m , defined as 
the temperature where half of the sample is unfolded, we de- 
fine the energy scale as e = T„,kB/T m . This approach was used 
in Ref s. fill. 

Here and in the main text we indicate the proteins either 
with their common name or with their pdb code. 

We find that the midpoint temperature of the 1TIT model 
molecule, at zero force, takes the value T - 8.03. The exper- 
imentally measured unfolding temperature for such a protein 
is T m =s 346 K U4J]. Consequently, we choose the energy scale 
to be e/k B =43.1 K. 

The midpoint temperature of the PIN1 molecule (pdb code 
1I6C), at zero force, is found to be T - 7.55. The experi- 
mentally measured unfolding temperature for such a protein 
is T,„ =i 332 K |@]. Consequently, we choose the energy scale 
to be e/k B - 44 K. 



[3] Zamparo M., Pelizzola A. (2006) Phys. Rev. Lett. 97:068106. 
[4] A. S. Politou, D. J. Thomas, and A. Pastore (1995) Biophys. J. 
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TABLE I: Map of the interaction parameter k, defined as k = e, ; /e, 
for the 1TIT protein. 
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FIG. 5: Cartoon of the model protein native structure. The lengths / j; -s are defined as discussed in the section Model Parameters. 



